[release-4.11] Bug 2118586: on-prem: improvements on resolv-prepender #3287
Conversation
Currently, a NetworkManager dispatcher script does not have the correct SELinux permission to dbus-chat with hostnamed. Work around the issue using systemd-run. See: https://bugzilla.redhat.com/show_bug.cgi?id=2111632 Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
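For illustration, a minimal sketch of that workaround, not the actual dispatcher script: the unit name, the hostnamectl call site, and the `NEW_HOSTNAME` variable are assumptions.

```sh
# Hypothetical sketch of the systemd-run workaround; the real script's call
# site and unit name may differ. Calling hostnamed directly from the
# dispatcher's SELinux context is denied, so the call is wrapped in a
# transient systemd unit that runs in a permitted context and is waited on.
systemd-run --wait --collect --unit=resolv-prepender-set-hostname \
    hostnamectl set-hostname --transient "${NEW_HOSTNAME}"
```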
If resolv-prepender takes more than the NetworkManager timeout (currently 90s), it might fail to bring up devices before we have had a chance to process all possible events for a device. This needs to account for the different types of events, and for IPv4 and IPv6 events in the case of dual stack, and overall take less than the NetworkManager timeout. Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
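As a rough illustration of that timing constraint (the event count and variable names are assumptions, not the actual script):

```sh
# Rough sketch only: split NetworkManager's 90s dispatcher timeout across the
# events a single device may generate (e.g. up, dhcp4-change, dhcp6-change,
# roughly doubled for dual stack) so that handling all of them still finishes
# before NetworkManager gives up on the dispatcher.
NM_DISPATCHER_TIMEOUT=90
MAX_EVENTS_PER_DEVICE=6   # assumed worst case for a dual-stack device
PER_EVENT_TIMEOUT=$(( NM_DISPATCHER_TIMEOUT / MAX_EVENTS_PER_DEVICE ))
echo "budgeting ${PER_EVENT_TIMEOUT}s per dispatcher event"
```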
node-ip can fail if a device is not ready to be bound to. Retry, but don't let the retries push the overall runtime past the NetworkManager timeout (90s), accounting for all the events we need to attend to. Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
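A hedged sketch of that retry idea, reusing the per-event budget from the previous sketch; the actual node-ip invocation and control flow in the script may differ.

```sh
# Keep retrying while the device settles, but only within the per-event budget
# so retries can never push the whole dispatcher run past the 90s limit.
# "node-ip show" is a stand-in for however the real script invokes node-ip.
retry_node_ip() {
    local deadline=$(( SECONDS + PER_EVENT_TIMEOUT ))
    until node-ip show >/dev/null 2>&1; do
        if [ "$SECONDS" -ge "$deadline" ]; then
            return 1   # give up; the device never became bindable in time
        fi
        sleep 1
    done
}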
Make resolv-prepender wait for nameservers in /run/NetworkManager/resolv.conf in all cases, to avoid copying it to /etc/resolv.conf without them. Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
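A minimal sketch of that wait, assuming the same bounded-wait pattern as above; the real prepend-and-install step is simplified here to a plain copy.

```sh
# Don't install /run/NetworkManager/resolv.conf into /etc/resolv.conf until it
# actually contains nameserver entries; in practice the wait is bounded by the
# per-event budget sketched earlier.
until grep -q '^nameserver' /run/NetworkManager/resolv.conf 2>/dev/null; do
    sleep 1
done
# Simplified stand-in for the real prepend-and-install step.
cp /run/NetworkManager/resolv.conf /etc/resolv.conf
```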
Without a properly configured resolv.conf, the openshift-dns CoreDNS pods will fail to run. These pods have the default DNS policy and will use the host's resolv.conf, which is the one kubelet picks up when it starts. Signed-off-by: Jaime Caamaño Ruiz <[email protected]>
@openshift-cherrypick-robot: Bugzilla bug 2105003 has been cloned as Bugzilla bug 2118586. Retitling PR to link against new bug. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@openshift-cherrypick-robot: This pull request references Bugzilla bug 2118586, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 6 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/test e2e-metal-ipi-ovn-dualstack
/approve
/assign @sinnykumari
Should e2e-vsphere and e2e-openstack be green as well?
/test e2e-openstack
Ideally, yes.
e2e-vsphere and e2e-openstack are still failing. I am adding my approval and will leave this to the on-prem team to decide when this is ready to be merged. Feel free to remove the hold when this looks fine.
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: cybertron, knobunc, openshift-cherrypick-robot, sinnykumari. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
I see no history of those jobs passing in this repo/branch. Triggering once more.
Great, the openstack job now failed with the all too familiar
/test e2e-openstack
According to the machine log,
It looks like the API server was not reachable at the time the node made the request; however, DNS seemed to work, so this is apparently a different issue (most likely infra-related).
The last openstack job had problems in master-0: weird issues with openshift-sdn, and must-gather was not able to collect node logs. While this does not look very good, I still can't tie anything specific to these changes. At least the vsphere job passed.
/test e2e-openstack
@mandre Again the node count issue: no error, but there are only 2 workers and both are worker-0. How's that?
This time, the openstack job failed again with the same error. Logs from the instance show the following error:
I'm not quite sure what causes it.
That's because they were created by the
I am trying to run this on my own with cluster bot. In the meantime...
/test e2e-openstack
Cluster bot was able to run the cluster with no issues.
/test e2e-openstack
On the last run, only some tests fail now. Of those, many I have also seen failing in the test PR #3297 job; then there are some tests that fail in one and not in the other, and vice versa. It looks to me like this infra is very sensitive to this job, but I at least managed to pass it once on my test PR.
/test e2e-openstack
OK, this last run looks better. We're getting a lot of etcd-related test failures on openstack recently due to the underlying infra, so the job failures aren't too alarming. I'd be willing to merge now if needed.
I think we're all in agreement that the openstack failure is unlikely to be caused by this patch, so we can go ahead and merge without that job in this instance.
Removing hold as the e2e-openstack test failure is unrelated.
/label cherry-pick-approved
@openshift-cherrypick-robot: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
@openshift-cherrypick-robot: All pull requests linked via external trackers have merged: Bugzilla bug 2118586 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This is an automated cherry-pick of #3271
/assign mandre